Commonly, a machine learning approach consists of four components: data, a model, a cost function, and an optimization procedure. The structure of this notebook follows these four components.
In Section 0, we handle the imports, set the hyperparameters, and create the directory structure.
Section 1 is about the data of our approach. We will load our data, preprocess it, and split it into training data and test data. The training data is used to train our model, the test data to verify our approach. We will also divide the data into batches.
In Section 2, we will define our model. In our case, the model (or function) that we want to learn is a mapping between our input data (power traces) and the output data (the parts of the key that we want to learn).
In Section 3, we will define our cost function (also frequently called the objective or loss function).
In Section 4, we will define our training procedure. We will use Adam, a variant of stochastic gradient descent (SGD) that combines momentum with per-parameter adaptive step sizes.
In Section 5, we will train our model.
In Section 6, we will test our model.
We also use a couple of helper functions. If a method is not needed for understanding the approach, we put it there. The helpers are located in library/helper.py and imported as `h`.
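To make the optimizer mentioned for Section 4 more concrete, here is a minimal sketch of a single Adam update in plain NumPy. It is not the notebook's TensorFlow code; the function name `adam_step`, the toy objective, and the learning rate of 0.1 are assumptions chosen for illustration only.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameters and both moment estimates."""
    m = beta1 * m + (1 - beta1) * grad        # momentum: running mean of the gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of the squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for the zero initialization
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# toy problem (hypothetical): minimize f(theta) = theta^2, whose gradient is 2 * theta
theta = np.array([5.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 1001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t)

print(theta)
```

The division by the square root of `v_hat` is what distinguishes Adam from plain SGD with momentum: every parameter gets its own effective step size.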
The core idea of a template attack is: given a certain power consumption measurement $t$, how likely is it that a certain key $k$ was used? In other words, we are looking for a model that gives us the conditional probability $p(k|t)$.
If we tried to search over the whole key space, we would not get far, because the key space of AES-128 contains $2^{128}$ keys, which is far too large. Instead, we learn sub-models for subkeys. For example, if we target the S-box operation of the AES algorithm, we need to learn 16 models for 16 subkeys, one per key byte.
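The following short sketch compares the two search spaces numerically. The variable names are illustrative. It assumes each subkey is one byte, as in the target example in this notebook.

```python
full_key_space = 2 ** 128        # candidates when guessing the whole AES-128 key at once
num_subkeys = 16                 # one subkey per key byte
values_per_subkey = 2 ** 8       # a byte can take 256 different values
divided_search = num_subkeys * values_per_subkey  # candidates when attacking byte by byte

print(full_key_space)
print(divided_search)
```

Attacking the bytes independently turns an infeasible search into 16 classification problems with 256 classes each.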
What we need to do for that:
Our target data:
Our input data:
# model 1
x_train_input = [
    [12, 14, 256, 13]  # power sub trace
]
y_train_target = [
    [0xAE]  # target sub key byte
]
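Since each sub-model has to choose one of 256 byte values, a common way to present such a target to a classifier is a one-hot vector. Whether this notebook encodes its targets this way is not shown here, so treat the following as a sketch under that assumption. `one_hot` and `NUM_CLASSES` are hypothetical names.

```python
import numpy as np

NUM_CLASSES = 256  # possible values of one subkey byte

def one_hot(byte_values, num_classes=NUM_CLASSES):
    """Turn a list of byte values into rows with a single 1 at the byte's index."""
    byte_values = np.asarray(byte_values, dtype=np.int64)
    encoded = np.zeros((byte_values.size, num_classes), dtype=np.float32)
    encoded[np.arange(byte_values.size), byte_values] = 1.0
    return encoded

target_bytes = [0xAE]              # the example target from above
target_one_hot = one_hot(target_bytes)
print(target_one_hot.shape, target_one_hot.argmax(axis=1))
```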
In [1]:
import tensorflow as tf
assert(tf.__version__=="1.2.0") # make sure we have the right tensorflow version
import numpy as np
import os
import logging
import library.helper as h
from IPython.display import Image # displaying images in ipython
In [2]:
# configure numpy
np.set_printoptions(precision=2)
np.random.seed(0)
# configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# configure ipython display
def show(img_file):
    try:  # only works in ipython notebook
        display(Image(filename=img_file))
    except Exception:  # e.g. display() is not defined outside IPython
        pass
In [ ]:
def create_dir(path):
    if not os.path.exists(path):
        os.mkdir(path)
create_dir("data")  # power traces go here
create_dir("graph")  # store tensorflow calc graph here
create_dir("visualizations")  # charts we generate
In [ ]:
RANDOM_SEED = 0
Index format:
In [ ]:
h.download_and_unpack_data(key=1) # for which key to download the data. possible [0-15]
operations = h.load_operations(key=1)
for i, k_v in enumerate(operations.items()):
    if i > 5:
        break
    print(k_v[0], k_v[1])
In [ ]:
power_trace = h.load_power_trace("data/DPA_contestv4_2/k00_uncompressed/DPACV42_000000.trc")[0:1000]
h.visualize_power_trace(power_trace, steps=1000, out_name="trace1")
power_trace2 = h.load_power_trace("data/DPA_contestv4_2/k00_uncompressed/DPACV42_000001.trc")[0:1000]
h.visualize_power_trace(power_trace2, steps=1000, out_name="trace2" )
power_trace3 = h.load_power_trace("data/DPA_contestv4_2/k00_uncompressed/DPACV42_000002.trc")[0:1000]
h.visualize_power_trace(power_trace3, steps=1000, out_name="trace3" )
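sum_trace = (power_trace + power_trace2 + power_trace3) / 3  # despite the name, this is the mean of the three traces
diff = abs(power_trace-sum_trace)  # absolute deviation of the first trace from that mean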
h.visualize_power_trace(diff, steps=1000, out_name="difference-1-mean" )
h.visualize_power_trace(sum_trace, steps=1000, out_name="sumtrace" )
#print(power_trace[0:100])
#print(power_trace.mean())
#power_trace = power_trace - power_trace.mean()
#show(os.path.join("visualizations", power_trace_img))
In [10]:
num_points_per_trace = 1000 #1704403
uncompressed_traces_dir = "data/DPA_contestv4_2/k%0.2d_uncompressed/"%0
uncompressed_traces = sorted(list(os.listdir(uncompressed_traces_dir)))
num_traces = len(uncompressed_traces)
unprocessed_traces_name = "data/unprocessed_traces.mm"
# write power traces to memmap
if not h.file_exists(unprocessed_traces_name):
    logger.info("Populating "+unprocessed_traces_name+", this might take a while.")
    fp = np.memmap(unprocessed_traces_name, dtype='int8', mode='w+', shape=(num_traces,num_points_per_trace))
    for i, trace_name in enumerate(uncompressed_traces):
        if i%100==0:
            logger.info("%0.3d/%i traces processed..."%(i,num_traces))
        trace = h.load_power_trace(os.path.join(uncompressed_traces_dir,trace_name),limit=num_points_per_trace)
        fp[i,:]=trace.flatten()
    fp.flush()  # make sure everything is written to disk
else:
    logger.info(unprocessed_traces_name +" already exists, delete it for regeneration.")
    fp = np.memmap(unprocessed_traces_name, dtype='int8', mode='r', shape=(num_traces,num_points_per_trace))
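The cell above stores the traces in a NumPy memory map, so they do not all have to fit in RAM at once. As a self-contained illustration of that write-then-reopen pattern, here is a small sketch that uses random stand-in data and a temporary file. The sizes and the file name `demo_traces.mm` are made up for the example.

```python
import os
import tempfile
import numpy as np

num_demo_traces, num_demo_points = 4, 10  # tiny hypothetical sizes
rng = np.random.RandomState(0)
demo_data = rng.randint(-128, 128, size=(num_demo_traces, num_demo_points)).astype(np.int8)

path = os.path.join(tempfile.mkdtemp(), "demo_traces.mm")

# mode='w+' creates (or overwrites) the file with the given shape
writer = np.memmap(path, dtype="int8", mode="w+", shape=(num_demo_traces, num_demo_points))
writer[:] = demo_data
writer.flush()   # push pending changes to disk
del writer       # release the write handle

# mode='r' reopens the file read-only; the shape has to be supplied again
reader = np.memmap(path, dtype="int8", mode="r", shape=(num_demo_traces, num_demo_points))
print(reader[0, :5])
```

The file itself stores only raw bytes, which is why `dtype` and `shape` have to match on both sides.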
In [6]:
fp[0:10,0:20]
h.visualize_power_trace(fp[0:5,0:1000], steps=1000, out_name="5-traces-0-1000" )
In [ ]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation, rc
from IPython.display import HTML
# First set up the figure, the axis, and the plot element we want to animate
fig, ax = plt.subplots()
# fixed axis limits: set_data() does not rescale the axes on its own
ax.set_xlim((0, 1000))
ax.set_ylim((-128, 128))
line, = ax.plot([], [], lw=2)
# initialization function: plot the background of each frame
def init_power_chart():
    line.set_data([], [])
    return (line,)
# animation function. This is called sequentially
def animate(i):
    stride = 1000 // 100  # the window moves 10 samples per frame
    start, stop = i * stride, (i + 5) * stride
    x = np.arange(start, stop)  # sample indices of the 50-point window
    y = fp[0, start:stop]
    line.set_data(x, y)
    return (line,)
# call the animator. blit=True would only re-draw the parts that have changed.
anim = animation.FuncAnimation(fig,
                               animate,
                               init_func=init_power_chart,  # pass the function itself, not its result
                               frames=95, interval=20, blit=False)
HTML(anim.to_html5_video())
In [5]:
h.visualize_power_trace(traces_column_mean, steps=1000, out_name="mean")
h.visualize_power_trace(abs(fp[0,:]-traces_column_mean), steps=1000, out_name="trace-1-mean-abs" )
h.visualize_power_trace(fp[0,:]-traces_column_mean, steps=1000, out_name="trace-1-mean" )
h.visualize_power_trace(abs(fp[1,:]-traces_column_mean), steps=1000, out_name="trace-2-mean-abs" )
h.visualize_power_trace(fp[1,:]-traces_column_mean, steps=1000, out_name="trace-2-mean" )
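The plots above rely on `traces_column_mean`, whose definition is not part of the cells shown here. A plausible reading is that it is the mean over all traces at each sample point. The sketch below computes such a mean on random stand-in data. All `demo_` names are hypothetical and the data is not from the DPA contest.

```python
import numpy as np

rng = np.random.RandomState(0)
# stand-in for fp: 5 traces with 1000 int8 samples each
demo_traces = rng.randint(-128, 128, size=(5, 1000)).astype(np.int8)

# cast to float explicitly; the per-sample mean is fractional in general
demo_float = demo_traces.astype(np.float64)
demo_column_mean = demo_float.mean(axis=0)    # one mean value per sample point
demo_centered = demo_float - demo_column_mean  # broadcasting subtracts it from every trace

print(demo_column_mean.shape, demo_centered.shape)
```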
In [ ]:
x_1_reduced = x_1[np.where(abs(x_1) > 24)]
len(x_1_reduced)
In [ ]:
x_2 = fp[1] - m_t
x_2_reduced = x_2[np.where(abs(x_2) > 24)]
len(x_2_reduced)
In [ ]:
h.visualize_power_trace(x_1, steps=5000)
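The last cells keep only those sample points whose deviation from the mean trace exceeds a threshold of 24. The definitions of `x_1` and `m_t` are not shown, so the sketch below assumes that `m_t` is the mean trace and that `x_1` is the first trace minus that mean, analogous to `x_2`. It runs on synthetic data with three injected outliers. All `demo_` names are hypothetical.

```python
import numpy as np

rng = np.random.RandomState(0)
demo_traces = rng.normal(0.0, 2.0, size=(20, 1000))  # low-noise stand-in traces
demo_traces[0, [100, 400, 750]] += 60.0              # inject three large deviations into trace 0

demo_m_t = demo_traces.mean(axis=0)    # assumed meaning of m_t: the mean trace
demo_x_1 = demo_traces[0] - demo_m_t   # assumed meaning of x_1: first trace minus the mean

threshold = 24
mask = np.abs(demo_x_1) > threshold    # True where the deviation is large
demo_x_1_reduced = demo_x_1[mask]      # same selection as x[np.where(abs(x) > 24)]
kept_indices = np.flatnonzero(mask)    # positions of the retained sample points

print(len(demo_x_1_reduced), kept_indices)
```

Keeping `kept_indices` alongside the reduced values is useful because the same sample positions have to be selected from every trace if the reduced traces are to be comparable.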